
    Functional Baby Talk: Analysis of Code Fragments from Novice Haskell Programmers

    What kinds of mistakes are made by novice Haskell developers as they learn about functional programming? Is it possible to analyze these errors in order to improve the pedagogy of Haskell? In 2016, we delivered a massive open online course which featured an interactive code evaluation environment. We captured and analyzed 161K interactions from learners. We report typical novice developer behavior; for instance, the mean time spent on an interactive tutorial is around eight minutes. Although our environment was restricted, we gain some understanding of Haskell novice errors. Parenthesis mismatches, lexical scoping errors and do-block misunderstandings are common. Finally, we make recommendations about how such beginner code evaluation environments might be enhanced.
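
    To illustrate the error classes the abstract names, here is a hedged, invented example (not drawn from the course data) of a common do-block misunderstanding, alongside its fix:

```haskell
-- Broken: novices often write `=` inside a do block, treating `getLine`
-- as a value rather than an IO action to be bound:
--
--   main = do
--     x = getLine        -- parse error: `=` is not valid here
--     putStrLn x
--
-- Fixed: bind the action's result with `<-`.
main :: IO ()
main = do
  x <- getLine           -- run the IO action and bind its result
  putStrLn ("Hello, " ++ x)
```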

    Algorithmic skeletons for exact combinatorial search at scale

    Exact combinatorial search is essential to a wide range of application areas including constraint optimisation, graph matching, and computer algebra. Solutions to combinatorial problems are found by systematically exploring a search space, either to enumerate solutions, determine if a specific solution exists, or to find an optimal solution. Combinatorial searches are computationally hard both in theory and practice, and efficiently exploring the huge number of combinations is a real challenge, often addressed using approximate search algorithms. Alternatively, exact search can be parallelised to reduce execution time. However, parallel search is challenging due to both highly irregular search trees and sensitivity to search order, leading to anomalies that can cause unexpected speedups and slowdowns. As core counts continue to grow, parallel search becomes increasingly useful for improving the performance of existing searches and allowing larger instances to be solved. A high-level approach to parallel search allows non-expert users to benefit from increasing core counts.

    Algorithmic Skeletons provide reusable implementations of common parallelism patterns that are parameterised with user code which determines the specific computation, e.g. a particular search. We define a set of skeletons for exact search, requiring the user to provide, in the minimal case, a single class that specifies how the search tree is generated and a parameter that specifies the type of search required. The five skeletons are: Sequential search; three general-purpose parallel search methods, Depth-Bounded, Stack-Stealing, and Budget; and a specific parallel search method, Ordered, that guarantees replicable performance.

    We implement and evaluate the skeletons in a new C++ parallel search framework, YewPar. YewPar provides both high-level skeletons and low-level search-specific schedulers and utilities to deal with the irregularity of search and knowledge exchange between workers. YewPar is based on the HPX library for distributed task parallelism, potentially allowing search to execute on multi-cores, clusters, cloud, and high performance computing systems. Underpinning the skeleton design is a novel formal model, MT^3, a parallel operational semantics that describes multi-threaded tree traversals, allowing reasoning about parallel search, e.g. describing common parallel search phenomena such as performance anomalies.

    YewPar is evaluated using seven different search applications (and over 25 specific instances): Maximum Clique, k-Clique, Subgraph Isomorphism, Travelling Salesperson, Binary Knapsack, Enumerating Numerical Semigroups, and the Unbalanced Tree Search Benchmark. The search instances are evaluated at multiple scales from 1 to 255 workers, on a 17-host, 272-core Beowulf cluster. The overheads of the skeletons are low, with a mean 6.1% slowdown compared to hand-coded sequential implementations. Crucially, for all search applications YewPar reduces search times by an order of magnitude, i.e. from hours/minutes to minutes/seconds, and we commonly see average parallel efficiency greater than 60% for up to 255 workers. Comparing skeleton performance reveals that no one skeleton is best for all searches, highlighting a benefit of the skeleton approach: multiple parallelisations can be explored with minimal refactoring. The Ordered skeleton avoids slowdown anomalies where, because search knowledge is order dependent, a parallel search takes longer than a sequential search. Analysis of Ordered shows that, while it is 41% slower on average (73% worst-case) than Depth-Bounded, in nearly all cases it maintains the following replicable performance properties: (1) parallel executions are no slower than single-worker sequential executions; (2) runtimes do not increase as workers are added; and (3) variance between repeated runs is low. In particular, where Ordered maintains a relative standard deviation (RSD) of less than 15%, Depth-Bounded suffers from an RSD greater than 50%, showing the importance of carefully controlling search orders for repeatability.
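
    The minimal user obligation described above, a single class specifying how the search tree is generated, can be sketched in a few lines. This is a hedged, Haskell-flavoured illustration of the skeleton idea (YewPar itself is C++, and all names here are hypothetical, not its API):

```haskell
-- The single thing a user must provide: node expansion.
class SearchNode a where
  children :: a -> [a]

-- One "skeleton": sequential depth-first enumeration of all nodes.
-- The parallel skeletons (Depth-Bounded, Stack-Stealing, Budget, Ordered)
-- would share this interface but differ in how subtrees are scheduled.
enumerate :: SearchNode a => a -> [a]
enumerate root = root : concatMap enumerate (children root)

-- Toy instance: binary strings of length at most three.
newtype BitString = BitString [Bool]

instance SearchNode BitString where
  children (BitString bs)
    | length bs >= 3 = []
    | otherwise      = [BitString (False : bs), BitString (True : bs)]

main :: IO ()
main = print (length (enumerate (BitString [])))
-- 15: the number of nodes in a depth-3 binary tree
```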

    Towards Generic Scalable Parallel Combinatorial Search

    Combinatorial search problems in mathematics, e.g. in finite geometry, are notoriously hard; a state-of-the-art backtracking search algorithm can easily take months to solve a single problem. There is clearly demand for parallel combinatorial search algorithms scaling to hundreds of cores and beyond. However, backtracking combinatorial searches are challenging to parallelise due to their sensitivity to search order and their irregularly shaped search trees. Moreover, scaling parallel search to hundreds of cores generally requires highly specialist parallel programming expertise. This paper proposes a generic scalable framework for solving hard combinatorial problems. Key elements are distributed memory task parallelism (to achieve scale), work stealing (to cope with irregularity), and generic algorithmic skeletons for combinatorial search (to reduce the parallelism expertise required). We outline two implementations: a mature Haskell Tree Search Library (HTSL) based around algorithmic skeletons and a prototype C++ Tree Search Library (CTSL) that uses hand-coded applications. Experiments on maximum clique problems and on a problem in finite geometry, the search for spreads in H(4,2^2), show that (1) CTSL consistently outperforms HTSL on sequential runs, and (2) both libraries scale to 200 cores, e.g. speeding up spreads search by a factor of 81 (HTSL) and 60 (CTSL), respectively. This demonstrates the potential of our generic framework for scaling parallel combinatorial search to large distributed memory platforms.
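
    The work-stealing discipline used to cope with irregularity can be summarised in a minimal sketch (hypothetical names, not HTSL's or CTSL's actual API): each worker pushes and pops subtree tasks at the front of its own deque, while idle workers steal from the back, taking the oldest, and typically largest, subtrees:

```haskell
import Data.Sequence (Seq, (<|), viewl, viewr, ViewL (..), ViewR (..), empty)

type Deque a = Seq a

pushLocal :: a -> Deque a -> Deque a
pushLocal = (<|)                 -- the owner works at the front

popLocal :: Deque a -> Maybe (a, Deque a)
popLocal d = case viewl d of
  EmptyL  -> Nothing
  x :< d' -> Just (x, d')

steal :: Deque a -> Maybe (a, Deque a)
steal d = case viewr d of        -- a thief takes from the back,
  EmptyR  -> Nothing             -- i.e. the oldest pushed subtree
  d' :> x -> Just (x, d')

main :: IO ()
main = print (fst <$> steal (pushLocal 3 (pushLocal 2 (pushLocal 1 (empty :: Seq Int)))))
-- Just 1: the oldest task is the one stolen
```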

    BigraphTalk: verified design of IoT applications

    Graphical IoT device management platforms, such as IoTtalk, make it easy to describe interactions between IoT devices. Applications are defined by dragging-and-dropping devices and specifying how they are connected, e.g. a door sensor controlling a light. While this allows simple and rapid development, it remains possible to specify unwanted device configurations – such as using the same device to drive a motor up and down simultaneously, risking damage to the motor. We propose BigraphTalk, a verification framework for IoTtalk that utilizes formal techniques, based on bigraphs, to statically guarantee that unwanted configurations do not arise. In particular, we check for invalid connections between devices, as well as type errors, e.g. passing a float to a boolean switch. To the best of our knowledge, BigraphTalk is the first platform to support the graphical specification of correct-by-design IoT applications. BigraphTalk provides fully automated verification and feedback without end-users ever needing to specify a bigraph. This means any application specifiable in IoTtalk is guaranteed, so long as verification succeeds, not to violate the given configuration constraints when deployed, with no extra cost to the user.
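
    The flavour of the static type check described, rejecting a float wired into a boolean switch, can be sketched as follows. This is a hedged illustration with hypothetical types; the bigraph machinery BigraphTalk actually uses is abstracted away entirely:

```haskell
data WireType = BoolT | FloatT | IntT deriving (Eq, Show)

data Port = Port { portName :: String, portType :: WireType } deriving Show

-- Reject a connection whose output type does not match the input type,
-- e.g. wiring a float-valued sensor into a boolean switch.
connect :: Port -> Port -> Either String (Port, Port)
connect out inp
  | portType out == portType inp = Right (out, inp)
  | otherwise = Left ("type error: " ++ portName out ++ " ("
                      ++ show (portType out) ++ ") -> " ++ portName inp
                      ++ " (" ++ show (portType inp) ++ ")")

main :: IO ()
main = print (connect (Port "thermometer" FloatT) (Port "switch" BoolT))
-- Left "type error: thermometer (FloatT) -> switch (BoolT)"
```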

    Replicable parallel branch and bound search

    Combinatorial branch and bound searches are a common technique for solving global optimisation and decision problems. Their performance often depends on good search order heuristics, refined over decades of algorithms research. Parallel search necessarily deviates from the sequential search order, sometimes dramatically and unpredictably, e.g. by distributing work at random. This can disrupt effective search order heuristics and lead to unexpected and highly variable parallel performance. The variability makes it hard to reason about the parallel performance of combinatorial searches. This paper presents a generic parallel branch and bound skeleton, implemented in Haskell, with replicable parallel performance. The skeleton aims to preserve the search order heuristic by distributing work in an ordered fashion, closely following the sequential search order. We demonstrate the generality of the approach by applying the skeleton to 40 instances of three combinatorial problems: Maximum Clique, 0/1 Knapsack and Travelling Salesperson. The overheads of our Haskell skeleton are reasonable: giving slowdown factors between 1.9 and 6.2 compared with a class-leading, dedicated, and highly optimised C++ Maximum Clique solver. We demonstrate scaling up to 200 cores of a Beowulf cluster, achieving speedups of 100x for several Maximum Clique instances. We demonstrate low variance of parallel performance across all instances of the three combinatorial problems and at all scales up to 200 cores, with median Relative Standard Deviation (RSD) below 2%. Parallel solvers that do not follow the sequential search order exhibit far higher variance, with median RSD exceeding 85% for Knapsack.
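
    The core ordering idea, distributing work so that it closely follows the sequential search order, can be sketched briefly. A hedged illustration with hypothetical names, not the skeleton's real API: subtrees are numbered in sequential search order, and an idle worker always takes the lowest-numbered pending subtree rather than a random one:

```haskell
import Data.List (sortOn)

data Task a = Task { seqIndex :: Int, subtree :: a }

-- Random stealing would pick an arbitrary task; ordered scheduling
-- always selects the task sequential search would explore next.
nextOrdered :: [Task a] -> Maybe (Task a, [Task a])
nextOrdered [] = Nothing
nextOrdered ts = Just (t, rest)
  where (t : rest) = sortOn seqIndex ts

main :: IO ()
main = print (seqIndex . fst <$> nextOrdered [Task 2 "b", Task 0 "a", Task 1 "c"])
-- Just 0: the subtree sequential search would visit first
```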

    Contributions of Motivation, Early Numeracy Skills, and Executive Functioning to Mathematical Performance. A Longitudinal Study

    The main goal of this longitudinal study is to examine the power of different variables, and their dynamic interactions, in predicting mathematical performance. The model proposed in this study includes indicators of motivational constructs (learning motivation and attributions), executive functioning (inhibition and working memory), and early numeracy skills (logical operations, counting, and magnitude comparison abilities), assessed during kindergarten, and mathematical performance in the second year of Primary Education. The sample consisted of 180 subjects assessed at two time points (ages 5–6 and 7–8). The results showed an indirect effect of initial motivation on later mathematical performance. Executive functioning and early numeracy skills mediated the effect of motivation on later mathematics achievement. Practical implications of these findings for mathematics education are discussed.

    Reconstruction of Oomycete Genome Evolution Identifies Differences in Evolutionary Trajectories Leading to Present-Day Large Gene Families

    The taxonomic class of oomycetes contains numerous pathogens of plants and animals but is related to nonpathogenic diatoms and brown algae. Oomycetes have flexible genomes comprising large gene families that play roles in pathogenicity. The evolutionary processes that shaped this gene content have not yet been studied by applying systematic tree reconciliation to the phylome of these species. We analyzed the evolutionary dynamics of ten Stramenopiles. Gene gains, duplications, and losses were inferred by reconciling the 18,459 gene trees constituting the phylome with a highly supported species phylogeny. We reconstructed a strikingly large last common ancestor of the Stramenopiles that contained ∼10,000 genes. Throughout evolution, the genomes of pathogenic oomycetes have constantly gained and lost genes, though gene gains through duplications outnumber the losses. The branch leading to the plant-pathogenic Phytophthora genus was identified as a major transition point, characterized by an increased frequency of duplication events that has likely driven speciation within this genus. Large gene families encoding different classes of enzymes associated with pathogenicity, such as glycoside hydrolases, are formed by complex and distinct patterns of duplications and losses, leading to their expansion in extant oomycetes. This study unveils the large-scale evolutionary dynamics that shaped the genomes of pathogenic oomycetes. By applying phylogeny-based analysis methods, it provides additional insights that shed light on the complex history of oomycete genome evolution and the emergence of the large gene families characteristic of this important class of pathogens.
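
    One standard test used in phylome-style tree reconciliation is the species-overlap criterion: an internal gene-tree node is called a duplication if the species below its two children overlap, and a speciation otherwise. A minimal sketch of that criterion follows; it is offered as a hedged illustration of the general technique, and the paper's exact pipeline may differ:

```haskell
import qualified Data.Set as Set

data GeneTree = Leaf String            -- leaf labelled with its species
              | Node GeneTree GeneTree

speciesBelow :: GeneTree -> Set.Set String
speciesBelow (Leaf s)   = Set.singleton s
speciesBelow (Node l r) = speciesBelow l `Set.union` speciesBelow r

-- A node is a duplication if its children share at least one species.
isDuplication :: GeneTree -> Bool
isDuplication (Leaf _)   = False
isDuplication (Node l r) =
  not (Set.null (speciesBelow l `Set.intersection` speciesBelow r))

main :: IO ()
main = print (isDuplication (Node (Leaf "Phytophthora_sojae")
                                  (Leaf "Phytophthora_sojae")))
-- True: both children contain the same species, so the split is a duplication
```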

    FAIR Data Pipeline: provenance-driven data management for traceable scientific workflows

    Modern epidemiological analyses to understand and combat the spread of disease depend critically on access to, and use of, data. Rapidly evolving data, such as data streams changing during a disease outbreak, are particularly challenging. Data management is further complicated by data being imprecisely identified when used. Public trust in policy decisions resulting from such analyses is easily damaged and is often low, with cynicism arising where claims of "following the science" are made without accompanying evidence. Tracing the provenance of such decisions back through open software to primary data would clarify this evidence, enhancing the transparency of the decision-making process. Here, we demonstrate a Findable, Accessible, Interoperable and Reusable (FAIR) data pipeline developed during the COVID-19 pandemic that allows easy annotation of data as they are consumed by analyses, while tracing the provenance of scientific outputs back through the analytical source code to data sources. Such a tool provides a mechanism for the public, and fellow scientists, to better assess the trust that should be placed in scientific evidence, while allowing scientists to support policy-makers in openly justifying their decisions. We believe that tools such as this should be promoted for use across all areas of policy-facing research.
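
    The provenance idea at the heart of the pipeline can be sketched as a data structure. This is a hedged, minimal illustration with hypothetical types, not the FAIR Data Pipeline's actual schema: every output records exactly which inputs and which version of the analysis code produced it, so a published claim can be walked backwards to primary data:

```haskell
data DataProduct = DataProduct
  { productName :: String
  , version     :: String
  } deriving Show

data ProvenanceRecord = ProvenanceRecord
  { output   :: DataProduct
  , codeRepo :: String         -- e.g. a repository URL plus commit hash
  , inputs   :: [DataProduct]
  } deriving Show

-- Following `inputs` from record to record recovers the full chain of
-- evidence behind any output.
main :: IO ()
main = print ProvenanceRecord
  { output   = DataProduct "r_estimate" "1.2.0"                -- hypothetical
  , codeRepo = "https://example.org/epi-model@abc123"          -- hypothetical
  , inputs   = [DataProduct "case_counts" "2020-05-01"]        -- hypothetical
  }
```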

    Comparability: manufacturing, characterization and controls, report of a UK Regenerative Medicine Platform Pluripotent Stem Cell Platform Workshop, Trinity Hall, Cambridge, 14–15 September 2015

    This paper summarizes the proceedings of a workshop held at Trinity Hall, Cambridge to discuss comparability, and includes additional information and references to related information added after the workshop. Comparability is the need to demonstrate equivalence of product after a process change; a recent publication states that this ‘may be difficult for cell-based medicinal products’. A well-managed change process is therefore required, which needs access to good science and regulatory advice; developers are encouraged to seek help early. The workshop shared current thinking and best practice and allowed the definition of key research questions. The intent of this report is to summarize the key issues and the consensus reached on each of them by the expert delegates.